High-dimensional chaos displayed by multi-component systems with a
single time-delayed feedback is shown to be accessible to time series analysis
of a scalar variable only. The mapping of the original dynamics onto scalar
time-delay systems defined on sufficiently high-dimensional spaces is
thoroughly discussed. The dimension of the "embedding" space turns out to be
independent of the delay time and thus of the dimensionality of the attractor
dynamics. As a consequence, the procedure described in the present paper
is clearly advantageous over the standard "embedding" technique in the case
of high-dimensional chaos, where the latter is practically inapplicable. The
mapping is not exact when delayed maps are used to reproduce the dynamics of
time-continuous systems, but the errors can be kept under control. In this
context, the approximation of delay-differential equations is discussed with
reference to different classes of maps. Appropriate tools to estimate the
a priori unknown delay time and the number of hidden components are
introduced. The generalized Mackey-Glass system is investigated in detail as
a testing ground for the theoretical considerations.



I. INTRODUCTION 



Complex time dependence in laboratory systems, in our natural environment or in living
beings can have a variety of origins. One of the most fascinating perspectives is represented
by the description of aperiodic fluctuations in terms of deterministic dynamical models.
In the last two decades, much work has been devoted to testing this hypothesis and to
characterising the underlying dynamics under the assumption that only a scalar time series is
available. Since the pioneering articles of Packard et al. [1], Takens [2], and Grassberger &
Procaccia [3], a sound body of knowledge has been progressively acquired [4], leading to the
establishment of a new discipline, nonlinear time series analysis. The general approach
consists in reconstructing the phase space from the observed scalar data, most often by
making use of the time delay embedding. In a sequence of spaces of increasing dimension,
one looks for the manifestation of deterministic structures such as finite attractor dimension
or enhanced predictability. Unfortunately, this approach suffers from severe limitations as
soon as the complexity of the underlying dynamics becomes relatively large.

Systems with time delayed feedback can create arbitrarily complex dynamics already
with very few variables and rather simple equations of motion. The Mackey-Glass equation
[5] is the best-known example of this kind. It is a first-order scalar differential equation with a
force field that depends on a past value of the variable itself. This model was suggested in a
physiological context (regulation of the production of red blood cells), where the mechanism
of time delayed feedback is rather common. Further examples come from disciplines as
diverse as biology, epidemiology, physiology, and control theory [6,7]. In physics,
this class of systems has been largely ignored, although time delayed feedback has been 
introduced in several laboratory experiments as an additional means to enhance chaotic 
properties of systems, as e.g. in the CO2 laser experiment performed in Ref. ||. From 
the mathematical point of view, time delayed feedback leads to delay-differential equations 
(see || for some results about the existence and uniqueness of solutions of the initial value 
problem). The corresponding phase space is infinite dimensional, as the initial condition is a 
generic function defined on the interval $[-\tau_0, 0]$, with $\tau_0$ being the delay time of the feedback
loop. In practice, however, high-frequency components are almost absent and thus a finite
number of variables suffices to parametrize the asymptotic solutions. On the other hand,
the fractal dimension $D$ can be made arbitrarily large, as it has been established that $D$ is
proportional to $\tau_0$ for sufficiently large $\tau_0$ [9,10].

As already mentioned, the direct reconstruction of attractors from scalar data through
time delay embedding using Takens' theorem is clearly limited to low-dimensional objects. A
recent estimate [11], which takes entropy-related folding effects of the embedding procedure
into account, shows that the minimal number of points N required for a clear manifestation
of determinism must be larger than Ve hD s D , where s is the required scaling range (e.g. 
s = 10 represents one decade of scaling) and h is the Kolmogorov-Sinai entropy. In practice, 
attractors with dimensions larger than 5 can hardly be identified by time series analysis using
Takens' theorem, since otherwise an unrealistically large amount of data and an unrealistically
low noise level would be required.

High dimensional attractors of systems with time delayed feedback are thus practically 
indistinguishable from colored noise. On the other hand, the underlying delay-differential 
equation couples only a few variables, so that it is natural to ask whether more effective 
techniques exist, which are able to reproduce the observed dynamics. It turns out that a 
reconstruction, not of the attractor in a proper phase-space, but of the dynamical rule in 
what we call "state" space, is often easier and equally effective. On that basis, the delay times
of unknown scalar systems were estimated from a time series with the help of appropriate
indicators [12,13]. Later, it was shown that the dynamical rule itself can be reconstructed from
the time series of scalar time delay systems [14-17], and thereby the Lyapunov spectrum
Tq] . Most importantly, the dimension of the state space does not depend on the delay time, 
opening up the possibility to model and characterize high-dimensional regimes as well. 

Since the restriction to scalar time-delay systems is, in practice, too severe, some efforts
have been made to extend the latter ideas to the case of multivariate time-delay systems.
On a phenomenological basis, it was demonstrated that the delay time can be estimated
also in such systems by treating the system analogously to a scalar one [18]. For a
multivariate delay system with a single time-delayed feedback, an embedding-like theorem for
delay systems was derived and applied to experimental data from a laser [20]. To this end,
an extension of Takens' theorem to input-output systems, as conjectured by Casdagli [21,22]
and later proven in [23], was applied to time delayed systems. For the general case of
multivariate delay systems (with multivariate delays), a multivariate measurement has
so far been required [24].

In this paper, we discuss in depth the theoretical aspects of the identification of a suitable 
state space for time delay feedback systems. We shall first consider the problem of mapping 
the original dynamics (possibly characterized by several variables) onto scalar models under 
the only restriction of a single feedback process. In the first part of Sec. II, the discussion is 
carried on for discrete-time models. The result is then extended to continuous-time models. 
In particular, we show that the reconstruction is possible both when the recorded variable is 
and when it is not the feedback variable. The only (important) difference between the two 
cases concerns the minimum dimension of the state-space such that a faithful reconstruction 
is possible: the minimum dimension turns out to be definitely larger in the latter case. In
Sec. III, we discuss the approximations involved in modelling continuous-time
dynamics in terms of delayed maps.

A thorough discussion of the various difficulties encountered in the practical implementation
of these theoretical ideas is then presented in Sec. IV with reference to the generalized
Mackey-Glass model: a differential delay equation involving two variables. Problems like
the determination of the delay time and the intrinsic limitations of local indicators are
investigated therein. Sec. V is devoted to the discussion of global indicators, while the open
problems are briefly reviewed in Sec. VI. In the second part of this paper, these concepts 
will be employed and illustrated in the case of experimental data taken from a CO2 laser 
with feedback. 



II. EMBEDDING THEORY 

In this subsection, we introduce multicomponent systems with delayed feedback and discuss
the possibility of mapping them onto suitable scalar models. Besides addressing a general
mathematical question (i.e. the equivalence between different classes of dynamical equations),
our motivation resides in the possibility of reconstructing the dynamics of a delayed
system from a single scalar variable.

As anticipated in the introduction, we shall refer to a general case with d variables. The 
only restriction that we impose concerns the number of feedback processes: we shall assume 
that only one variable is fed back. We believe that this is a sufficiently general standpoint 
to begin a meaningful study of delayed systems. 

Although the physically meaningful models are continuous-time systems, it is worth
considering also delayed maps (DMs), since delay-differential equations (DDEs) are implemented
on digital computers precisely by constructing a suitable DM and, more importantly, DMs can be
studied more efficiently to extract the relevant physical properties from experimental signals. More
precisely, we shall also consider the generic $d$-component DM

$$y(n+1) = F\bigl(y(n),\, y_1(n-\tau_0)\bigr), \tag{1}$$

where $y(n) \in \mathbb{R}^d$ and the delay time $\tau_0$ is a positive integer. The initial condition of
the DM consists of a $(d+\tau_0)$-component vector, so that the phase space is $\mathbb{R}^{d+\tau_0}$. Again
without loss of generality, the feedback variable is assumed to be the first component.
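
As a minimal sketch of how a DM of the form of Eq. (1) can be iterated, the following Python snippet keeps the d current components together with a buffer of the most recent values of the feedback variable, i.e. exactly the d + tau0 numbers that make up the initial condition. The function names and the particular 2-component map `example_map` are hypothetical and chosen only for illustration; they are not taken from the text.

```python
from collections import deque
from typing import Callable, List, Sequence

Vector = List[float]


def iterate_dm(F: Callable[[Sequence[float], float], Vector],
               y0: Sequence[float],
               past_y1: Sequence[float],
               n_steps: int) -> List[Vector]:
    """Iterate y(n+1) = F(y(n), y_1(n - tau0)).

    y0      : the d components at time n = 0
    past_y1 : y_1(-tau0), ..., y_1(-1), i.e. tau0 values, oldest first
    """
    tau0 = len(past_y1)
    # history holds y_1(n - tau0), ..., y_1(n); its left end is the delayed value
    history = deque(list(past_y1) + [y0[0]], maxlen=tau0 + 1)
    y = list(y0)
    orbit = [y]
    for _ in range(n_steps):
        y = F(y, history[0])
        history.append(y[0])  # the oldest entry drops out automatically
        orbit.append(y)
    return orbit


def example_map(y: Sequence[float], y1_delayed: float) -> Vector:
    """Hypothetical 2-component map (d = 2), used only as an illustration."""
    return [0.5 * y[0] + 0.3 * y[1] + 0.2 * y1_delayed,
            0.4 * y[1] - 0.1 * y[0]]


# d = 2 and tau0 = 2: the initial condition has d + tau0 = 4 components
orbit = iterate_dm(example_map, [1.0, 0.0], [2.0, 3.0], 3)
```

The buffer length is fixed by the delay alone, which is why the phase-space dimension grows with the delay while the number of arguments of `F` does not.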

With reference to discrete-time systems, we now discuss the question of reconstructing 
the dynamics of a given component in terms of the values of the same component at different 
times. The embedding theorems P,pTf tell us that the knowledge of sufficiently many values 
of $y_k(n)$ for a given $k$ (the chosen component) suffices to reconstruct the dynamics on the
attractor. More precisely, it is possible to express the value of $y_k(n+1)$ as a function of its
$2D$ previous values, where $D$ is the attractor dimension. The point we want to address here
is the possibility of reconstructing the dynamics with far fewer variables than required by the
embedding theorem.

We start from the simple assumption of a linear dynamical system

$$y(n+1) = A\,y(n) + a\,y_1(n-\tau_0), \tag{2}$$

where $A$ is a $d \times d$ matrix and $a$ is a $d$-component vector. Next we need to specify which
variable is actually recorded; the very structure of the above equation reveals a difference
between the first variable (the only one being fed back) and all the others. We will see that
such a difference plays an important role in the construction of an optimal model.

We first consider the case of the variable $y_1$ being recorded. The problem we want to
discuss is that of finding the minimum amount of information needed to determine $y_1(n+1)$, when
the only available information consists of the past values of $y_1$ itself. All components of
$y$ are, in principle, necessary, but we shall see that they are implicitly determined, so that
a truly deterministic reconstruction of the dynamical rule is possible from the
knowledge of $y_1$ alone. Therefore, we define all components of $y$ except $y_1$ as "hidden variables",
i.e. unknowns that must be determined in order to be explicitly eliminated from the final
dynamical law.

We will now discuss the problem of the information needed to construct a model in a
pictorial way, by referring to Fig. 1. We hope that this will be clear enough to be easily
followed without the need to enter into technicalities. Each row in Fig. 1 is a schematic description
of the information involved in the application of Eq. (2) at a specific time. A full square
positioned at site $n$ of the time lattice indicates that all the $d-1$ hidden variables at
time $n$ are required in the iteration of the DM. Let us start from the uppermost row. As
we have to determine only $y_1$ at time $n+1$ (see the question mark in the figure), we need to
consider only one equation which, in general, will however depend on all variables at time $n$
(see the corresponding full square at time $n$) and on $y_1$ at time $n-\tau_0$ (see the triangle). As a
result, we have $(d-1)$ (the full square) plus 1 (the question mark) unknowns. This number
is reported in column A on the right of the figure. The net difference between the number
of available equations and that of unknowns is instead reported in column B. We see that,
in this case, since we have considered just one equation, such a difference is precisely $1-d$.
Accordingly, we reach the trivial conclusion that we cannot determine $y_1(n+1)$ from the
knowledge of only $y_1(n)$ and $y_1(n-\tau_0)$.

FIG. 1. Illustration of the coupling of the variables in the case where the embedding aims at
the elimination of all the variables except the one with feedback. Full squares denote the hidden
variables. Open triangles denote the variable with the time delayed feedback, which is accessible
to measurement.



Further information can be obtained from the past history. Let us start by discussing the
application of the dynamical law at the previous time step (see the second row in Fig. 1). In
this case, all the $d$ equations come into play. The number of new unknown variables is $(d-1)$, i.e.
the square at time $n-1$, so that the difference between new equations ($d$) and new unknowns is
1. Therefore, the addition of this new step reduces the global gap between unknowns
and equations. Iterating this procedure $d-1$ times eventually leads to a break-even
point, where the number of equations is equal to the number of unknowns. This means
that we have to consider $d$ rows in Fig. 1, i.e. that $y_1(n+1)$ is unambiguously determined
once $y_1$ itself is known in at least two windows of length $d$. Formally, we find that for
$m \ge m_d$ (see footnote 1), with $m_d = d$,

$$y_1(n+1) = \beta \cdot v(n; m, \tau_0), \tag{3}$$

where

$$v(n; m, \tau_0) = \bigl(y_1(n), y_1(n-1), \ldots, y_1(n-m+1);\; y_1(n-\tau_0), y_1(n-\tau_0-1), \ldots, y_1(n-\tau_0-m+1)\bigr), \tag{4}$$

an expression stating that we have been able to transform the initial multicomponent DM
(2) into the scalar equation (3). The price we had to pay is that now the dependence on the
past values of $y_1$ is not restricted to a single value, as originally assumed in Eq. (2), but $d$
consecutive values are needed.
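
The elimination of the hidden variable can be checked numerically in the simplest case d = 2. Combining the linear map at times n and n-1 to eliminate y_2 gives a scalar recurrence with coefficient vector (tr A, -det A, a_1, A_12 a_2 - A_22 a_1) acting on two windows of length 2. These closed-form coefficients are worked out here for illustration and are not quoted from the text; the matrix, the vector, and the function names below are likewise arbitrary assumptions.

```python
from typing import List, Sequence


def simulate_linear_dm(A, a, tau0: int, n_steps: int) -> List[List[float]]:
    """Iterate y(n+1) = A y(n) + a * y_1(n - tau0) for d = 2.

    List index i corresponds to time i - tau0.
    """
    y1 = [0.1 * (k + 1) for k in range(tau0 + 1)]  # arbitrary y_1 history
    y2 = [0.0] * tau0 + [0.3]                      # only the last entry is used
    for n in range(tau0, tau0 + n_steps):
        delayed = y1[n - tau0]
        y1.append(A[0][0] * y1[n] + A[0][1] * y2[n] + a[0] * delayed)
        y2.append(A[1][0] * y1[n] + A[1][1] * y2[n] + a[1] * delayed)
    return [y1, y2]


def scalar_model(A, a) -> List[float]:
    """Coefficients of the scalar recurrence for y_1 when d = 2."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [trace, -det, a[0], A[0][1] * a[1] - A[1][1] * a[0]]


def window_vector(y1: Sequence[float], n: int, m: int, tau0: int) -> List[float]:
    """Two windows of m consecutive values of y_1, separated by tau0."""
    return ([y1[n - k] for k in range(m)] +
            [y1[n - tau0 - k] for k in range(m)])


A = [[0.5, 0.3], [-0.2, 0.4]]  # illustrative values
a = [0.1, 0.2]
tau0 = 5
y1, _ = simulate_linear_dm(A, a, tau0, 60)
beta = scalar_model(A, a)

# The scalar model uses the map at two consecutive times, hence n >= tau0 + 1.
residuals = [
    abs(y1[n + 1]
        - sum(b * v for b, v in zip(beta, window_vector(y1, n, 2, tau0))))
    for n in range(tau0 + 1, len(y1) - 1)
]
max_residual = max(residuals)
```

In this sketch `max_residual` should vanish up to rounding errors, whatever the value of `tau0`: the scalar model needs 2d = 4 past values of the recorded variable and never the hidden component `y2`.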

If $\tau_0 < d$, the $2d$ variables appearing on the r.h.s. of Eq. (3) overdetermine $y_1(n+1)$,
since in the above described process some unknowns are counted twice. As we have in mind
applications to models with few components compared to the delay, we shall not discuss
this point further. Moreover, it is instructive to see that the dimension of the phase
space is $\tau_0 + d$ in the reconstructed model as well as in the original one: iteration of Eq. (3)
indeed requires knowing $y_1(l)$ in the whole range $n \ge l > n - \tau_0 - d$. Accordingly, the model
Eq. (3) provides a faithful reconstruction of the whole dynamics, including the convergence
to the asymptotic attractor. This is to be contrasted with the possibility offered by the
standard embedding technique to describe only the dynamics on the attractor itself.

Footnote 1: For the remainder of this paper we will term $m_d$ the minimal "window size" of the model (3)
that guarantees a proper embedding.

The advantage over the standard application of the embedding theorem becomes more
transparent if we also notice that the number of variables needed to reconstruct the dynamics
is $2d$, independently of the delay $\tau_0$, i.e. independently of the phase-space dimension $\tau_0 + d$,
which can be arbitrarily large. In particular, the technique can be equally effective also in
the high-dimensional regimes generally existing whenever $\tau_0 \gg 1$ (let us recall that the
dimension of the attractor is proportional to the delay).

In the case of a nonlinear DM (1), the basic difference is that the function $F$ is, in
general, non-invertible. This implies that longer sequences of variables must be considered
to remove the ambiguities inherent in the lack of invertibility. In analogy to the embedding
theorem, it is natural to conjecture that the model equation

$$y_1(n+1) = f\bigl(v(n; m, \tau_0)\bigr), \tag{5}$$

with two windows of length $m \ge m_d$, and $d \le m_d \le 2d+1$, suffices to faithfully reconstruct
the dynamics even in the worst case. This conjecture is indeed confirmed if we interpret the
delayed feedback as an "external" driving and thus see the whole system as an input-output
system like those considered in [22]. This analogy, suggested in [20], allows referring to the
generalization of embedding theorems reported in Ref. [23], which precisely indicates that
$2(2d+1)$ is a true upper bound for the number of variables to be actually used in the model
reconstruction.

The feedback variable is certainly peculiar and different from all other variables involved
in the dynamical process. It is therefore interesting to ask whether a compact
reconstruction of the model is still possible if it is not the feedback variable that is measured, but any other
variable. The answer is yes, but the number of variables needed to acquire the necessary information
is larger than before, and the proof is also rather cumbersome, so that a pictorial representation
such as the one reported in Fig. 2 will be very helpful. In this case, we must distinguish
among three types of variables: $(d-2)$ hidden variables without feedback (represented by a
full square); the hidden feedback variable (cross); and the experimentally observed variable
(open triangle). In the first step of the procedure, there is one more unknown variable than
before (since $y_1(n-\tau_0)$ is unknown, too), so that the gap between equations and unknowns is
$-d$. The second step is likewise less favourable than before, since the existence of an additional variable
(the feedback variable, which is not recorded) prevents a net gain. Therefore, recursively
repeating the very same step does not allow removing all the unknowns. Nevertheless, we
can still find a meaningful solution by modifying our strategy as described in the third step,
where we consider the application of the mapping at time $n-\tau_0-1$. After comparing the
newly involved variables with those already introduced in the two previous time steps, one
sees that the additional gap is equal to $3-d$. This result is strictly positive only if $d \le 2$,
thus suggesting that this new strategy leads in general to worse results. However, from now
on, one can alternate steps of the previous and new type (see, e.g., the fourth and the
fifth line in Fig. 2): this allows gaining 1 equation every second step. The break-even point
is reached after $2d-1$ steps. This means that the recorded variable must be known in
two windows of length $2d-1$. Accordingly, the price to be paid for not dealing with the
feedback variable is that the number of "variables" is almost twice as large as before.
Nevertheless, we can still consider this last result as positive, since the dimension of the space is
still independent of the delay. An important difference from the previous case concerns the
phase-space dimension. The iteration of the reconstructed model now requires knowing a
single variable over $\tau_0 + 2d - 1$ consecutive times, a number larger than the initial dimension
$\tau_0 + d$ if $d > 1$. This means that our procedure has enlarged the phase-space dimension,
introducing some spurious directions. We now want to show that the price for keeping the
dimension of the phase space equal to the original value is the construction of a much more
complicated model. In fact, with reference again to Fig. 2, we see that the steps of type 1
do not allow any gain only until we arrive at time $n-\tau_0$. However, from that point on the
number of unknowns decreases by one unit per step, since the variable $y_1$ was already taken
into account, so that we eventually do not need to go beyond time $n-\tau_0-d$. However,
in doing so, all variables in the entire delay time are included, i.e. the standard embedding
approach has been followed.

More generally, in the case of nonlinear systems, we expect that a model

$$y_2(n+1) = f\bigl(v(n; m, \tau_0)\bigr), \tag{6}$$

exists for $m \ge m_d$, with $d \le m_d \le 4d-1$. However, it is fair to recognize that one will
hardly be able to go beyond $d = 2$ in practical cases.

FIG. 2. Illustration of the coupling of the variables in the case where the embedding aims at the
elimination of the variable with feedback. Hidden variables without feedback are denoted by full
squares. The hidden variable with feedback is denoted by crosses. The experimentally observed
variable is denoted by open triangles.



As in the standard embedding technique, the presence of nonlinearities with the possible
noninvertibility of some functions might require doubling the number of variables necessary
for a faithful reconstruction of the dynamics. Note that the number of spurious directions
introduced by the DM-model in the nonlinear case is at most $3d-1$, and therefore much
smaller than the number of spurious directions introduced by a Takens-type model,
which can be up to $\tau_0 + d + 1$.

In the case of time-continuous models, we refer to first-order delay-differential equations
(DDEs) of the type

$$\dot{x} = H\bigl(x,\, x_1(t-\tau_0)\bigr), \tag{7}$$

where, without loss of generality, the feedback variable is assumed to be the first component
$x_1$ of the $d$-dimensional vector $x$, while $\tau_0 \in \mathbb{R}^+$ is the delay time. The initial condition
for the DDE consists of a differentiable function in the interval $[t_0-\tau_0, t_0]$ plus a
$(d-1)$-dimensional vector (i.e. the remaining components) at time $t_0$. Therefore, the phase space
is $C[0,\tau_0] \times \mathbb{R}^{d-1}$.

For time-continuous models (7), basically the same procedure applies, except that instead
of including the dependence on additional times in the past, as in the discrete-time case, one has to add
higher-order time derivatives. The final results are:

(1) a scalar DDE of $m$-th order for the variable $x_1$ (with $m \ge m_d$ and $d \le m_d \le 2d+1$),

$$x_1^{(m)} = h\bigl(w(m, \tau_0)\bigr), \tag{8}$$

where

$$w(m, \tau_0) = \bigl(x_1, x_1^{(1)}, \ldots, x_1^{(m-1)};\; x_1(t-\tau_0), x_1^{(1)}(t-\tau_0), \ldots, x_1^{(m-1)}(t-\tau_0)\bigr). \tag{9}$$

We write $x_1^{(i)}$ for the $i$-th derivative of the variable $x_1$ with respect to time.

(2) a scalar DDE of $m$-th order for the variable $x_2$ (with $m \ge m_d$ and $d \le m_d \le 4d-1$),

$$x_2^{(m)} = h\bigl(w(m, \tau_0)\bigr), \tag{10}$$

where

$$w(m, \tau_0) = \bigl(x_2, x_2^{(1)}, \ldots, x_2^{(m-1)};\; x_2(t-\tau_0), x_2^{(1)}(t-\tau_0), \ldots, x_2^{(m-1)}(t-\tau_0)\bigr). \tag{11}$$

III. FROM CONTINUOUS TO DISCRETE TIME 

In the previous section we have seen that vectorial delay models can be mapped onto
scalar ones by embedding the attractors into suitable state spaces defined in terms of a single
variable recorded in two windows of length $m_d$. These results provide the minimal framework
for reconstructing the dynamics starting from the knowledge of just one observable. However,
the exact inference of the model is formally possible only when a DDE (DM) dynamics is
reconstructed in terms of a continuous (discrete) time model.






In reality, almost all physically meaningful processes stem from continuous-time
equations, while data are typically accessible as sequences of values sampled with a finite
frequency. Accordingly, the typical situation consists in constructing a DM that mimics a DDE,
i.e. we have to deal with the problem of passing from one class of models to the other. In
this section, we discuss this problem, showing that the model mismatch implies that
increasingly faithful reconstructions are only possible at the expense of increasing the state-space
dimension. This can be done by lengthening either the first or the second window of each
pair.

Before proceeding to the general discussion, it is important to stress that all the results 
reported in this section are derived under the assumption that (i) there exists a finite- 
dimensional attractor (this can be shown under quite general conditions ||), (ii) the attractor 
dynamics is high-dimensional. In particular, we assume that the length of the window pairs
is smaller than the minimal embedding dimension required by Takens' theorem. This
is because, as explained in the introduction, we want to consider cases where the usual
embedding techniques fail to provide a faithful reconstruction.

For the sake of simplicity, we first consider a scalar DDE, $\dot{x} = H(x, x(t-\tau_0))$, and
assume that the continuous variable $x(t)$ is recorded with a constant sampling time $\Delta$ on
the discrete time lattice $t_0 + n\Delta$ with $n \in \mathbb{Z}$. Let us write $x(n) = x(t_0 + n\Delta)$ from now on. In
this framework we shall investigate the degree of accuracy that can be reached within
the class of scalar DM-models. Let $\mathcal{A}(m_1, m_2)$ be the class of analytic functions $h$,

$$\mathcal{A}(m_1, m_2) = \{h : \mathbb{R}^{m_1+m_2} \to \mathbb{R}\}. \tag{12}$$

Consider the DM-model

$$x(n+1) = h\bigl(v(n; m_1, m_2, \tau_0)\bigr), \tag{13}$$

with $h \in \mathcal{A}$, and

$$v(n; m_1, m_2, \tau_0) = \bigl(x(n), x(n-1), \ldots, x(n-m_1+1);\; x(n-\tau_0), x(n-\tau_0-1), \ldots, x(n-\tau_0-m_2+1)\bigr), \tag{14}$$

with window pairs $(m_1, m_2)$ (see footnote 2) separated by a time $\tau_0$. We quantify the accuracy of the
DM-model $h$ in Eq. (13) with the help of the one-step forecast error (FCE):

$$\sigma^2(h; m_1, m_2, \tau_0) = \frac{\bigl\langle \bigl(x(n+1) - h(v(n; m_1, m_2, \tau_0))\bigr)^2 \bigr\rangle}{\bigl\langle (x - \langle x \rangle)^2 \bigr\rangle}, \tag{15}$$

where $\langle\cdot\rangle$ denotes a time average.
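
A minimal sketch of how such a forecast error can be evaluated on a sampled series is given below. It computes the root-mean-square one-step prediction error divided by the standard deviation of the signal; whether this normalization matches the intended definition exactly is an assumption made here. The scalar delayed map used to generate test data, and all function names, are hypothetical and serve only to exercise the code.

```python
import math
from typing import Callable, List, Sequence


def window_pair(x: Sequence[float], n: int, m1: int, m2: int, tau0: int) -> List[float]:
    """Two windows of lengths m1 and m2 ending at times n and n - tau0."""
    return ([x[n - k] for k in range(m1)] +
            [x[n - tau0 - k] for k in range(m2)])


def forecast_error(x: Sequence[float],
                   h: Callable[[Sequence[float]], float],
                   m1: int, m2: int, tau0: int) -> float:
    """Normalized rms one-step forecast error of the model h on the series x."""
    first = max(tau0 + m2 - 1, m1 - 1)  # earliest n for which both windows exist
    errors = [x[n + 1] - h(window_pair(x, n, m1, m2, tau0))
              for n in range(first, len(x) - 1)]
    mean = sum(x) / len(x)
    variance = sum((xi - mean) ** 2 for xi in x) / len(x)
    mse = sum(e * e for e in errors) / len(errors)
    return math.sqrt(mse / variance)


def delayed_map_series(tau0: int, n_steps: int) -> List[float]:
    """Illustrative scalar delayed map that stays inside [-1, 1]."""
    x = [0.1 * (k + 1) for k in range(tau0 + 1)]
    for n in range(tau0, tau0 + n_steps):
        x.append(0.5 * x[n] + 0.5 * (1.0 - 2.0 * x[n - tau0] ** 2))
    return x


def exact_model(v: Sequence[float]) -> float:
    return 0.5 * v[0] + 0.5 * (1.0 - 2.0 * v[1] ** 2)


def persistence_model(v: Sequence[float]) -> float:
    return v[0]  # deliberately poor model: predict no change


tau0 = 5
series = delayed_map_series(tau0, 500)
fce_exact = forecast_error(series, exact_model, 1, 1, tau0)
fce_persistence = forecast_error(series, persistence_model, 1, 1, tau0)
```

Because the test data come from a delayed map, a (1,1) model can reproduce them exactly and its error vanishes; for data sampled from a DDE the error of the best model stays finite, which is the point made in the following paragraphs.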

Any model $h$ can be geometrically seen as an $(m_1+m_2)$-dimensional manifold in the
state space augmented by the $x(n+1)$ direction (we shall call it the S-space). The FCE
is trivially larger than zero whenever the original data lie on a manifold different from the
one identified by the model. This is an error that can be removed by properly constructing
the model. Conversely, if the data are distributed in a broader region, i.e. also transversally
with respect to a hypothetical manifold, no exact model can be constructed and the FCE is
bounded away from zero. This is precisely what we expect to happen because of the model
mismatch: for any choice of window pair $(m_1, m_2)$, the variable $x(n+1)$ fluctuates in a small
but finite interval, so that the FCE cannot be smaller than the average thickness of the
distribution of points.

Footnote 2: We have introduced the notation $(m_1, m_2)$ to emphasize that the lengths of the two windows may
be different. In that respect, the definition of $v$ differs from the one given in the previous section.

In order to clearly distinguish the latter fundamental limitation from trivial modelling
errors, it is sufficient to define $\sigma_{\mathcal{A}}(m_1, m_2, \tau_0)$ as

$$\sigma_{\mathcal{A}}(m_1, m_2, \tau_0) = \min\{\sigma(h; m_1, m_2, \tau_0) \mid h \in \mathcal{A}(m_1, m_2)\}. \tag{16}$$

$\sigma_{\mathcal{A}}(m_1, m_2, \tau_0)$ establishes the maximum level of accuracy that can be reached with a fixed
window pair $(m_1, m_2)$ and a delay time $\tau_0$ in the class of DM-models $\mathcal{A}$. From now on,
the function $h \in \mathcal{A}$ that minimizes the FCE is called $\hat{h}$, and therefore
$\sigma_{\mathcal{A}}(m_1, m_2, \tau_0) = \sigma(\hat{h}; m_1, m_2, \tau_0)$. We shall see that there are at least two alternative procedures to increase
the accuracy of the reconstruction. They amount to considering window pairs of the type
$(1, m_2)$ and $(m_1, 1)$, respectively.

Let us first discuss the (l,m 2 ) case. Uniqueness and existence theorems || guarantee 
that the original DDE model can be written as a functional

$$x(t+\Delta) = G\bigl[x(t), \{x\}_\Delta\bigr], \tag{17}$$

where $\{x\}_\Delta = \{x(t') \mid t-\tau_0 \le t' \le t-\tau_0+\Delta\}$. A simple example of the above functional
dependence can be obtained in the case of the model class

$$\dot{x} = -\mu x + F\bigl(x(t-\tau_0)\bigr), \tag{18}$$

to which both the Ikeda and Mackey-Glass models belong. A formal integration of
Eq. (18) yields

$$x(t+\Delta) = x(t)\,e^{-\mu\Delta} + \int_0^{\Delta} dt'\, F\bigl(x(t-\tau_0+t')\bigr)\, e^{\mu(t'-\Delta)}. \tag{19}$$

If $x(t-\tau_0+t')$ is nearly constant within the integration interval, one can approximate the
functional dependence with a single value of the variable $x(t-\tau_0+t')$ within the integration
interval. This amounts to constructing a $(1,1)$-model, and the uncertainty on $x(t+\Delta)$ is
precisely the above introduced FCE $\sigma_{\mathcal{A}}(1, 1, \tau_0)$, which is of order $\Delta^2$ (see footnote 3).
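
To make the (1,1) approximation explicit, one may freeze the delayed argument at $t' = 0$ in the formal solution. The following is a sketch derived from the integral above, writing $\mu$ for the decay rate of the linear term (the symbol is an assumption); it is not a formula quoted from the text.

```latex
\begin{aligned}
x(t+\Delta) &\simeq x(t)\,e^{-\mu\Delta}
  + F\bigl(x(t-\tau_0)\bigr)\int_0^{\Delta} e^{\mu(t'-\Delta)}\,dt'
  = x(t)\,e^{-\mu\Delta} + \frac{1-e^{-\mu\Delta}}{\mu}\,F\bigl(x(t-\tau_0)\bigr), \\[4pt]
\text{error} &= \int_0^{\Delta}
  \Bigl[F\bigl(x(t-\tau_0+t')\bigr) - F\bigl(x(t-\tau_0)\bigr)\Bigr]\,
  e^{\mu(t'-\Delta)}\,dt' = O(\Delta^2).
\end{aligned}
```

The bracket is of order $t'$ for a differentiable $F(x(\cdot))$, so integrating it over an interval of length $\Delta$ gives a contribution of order $\Delta^2$, in agreement with the estimate for the (1,1) model.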

A better accuracy can be achieved if two or more consecutive points are assumed to be
known in the vicinity of $x(t-\tau_0)$, since their knowledge allows constructing higher-order
approximations of $F$. Simple perturbative arguments suggest that the error made in the
estimation of $x(t+\Delta)$ is of the order $\Delta^{m_2+1}$ if $m_2$ consecutive points are used (i.e., if a
window pair $(1, m_2)$ is considered) and $\Delta$ is small enough. In fact, the problem of estimating
the error for fixed $\Delta$ and $m_2$ large enough is highly nontrivial and deserves a discussion
of its own. Here, without attempting to derive asymptotic estimates of the dependence of
$\sigma_{\mathcal{A}}(1, m_2, \tau_0)$ on $m_2$ and $\Delta$, we limit ourselves to considering two limiting cases. The first one
consists in assuming that the Fourier modes above a certain frequency $\omega_c$ are slaved modes,
i.e. they are uniquely determined by the amplitude of the lower-frequency modes. In this
case, if the sampling time $\Delta < 2\pi/\omega_c$, we expect the residual uncertainty on $x(t+\Delta)$ to vanish
for increasing $m_2$, although it is not obvious how rapidly. In the opposite limit,
we can assume that the amplitude of each high-frequency mode is an independent variable
(as in a stochastic process). In this case, the uncertainty on $x(t+\Delta)$ would depend on the
"power" contained in the Fourier spectrum above the sampling frequency $\omega_s = 2\pi/\Delta$ and
would not decrease for increasing $m_2$. Were this the typical condition generated by DDEs,
one should conclude that the model mismatch is so severe that one can never reproduce a
continuous-time dynamics with arbitrary accuracy (with the exception of $m_2$ larger than the
minimal dimension required by the embedding theorems for a faithful reconstruction).

Footnote 3: In this section we always assume that the delay $\tau_0$ is perfectly known and the uncertainty is
entirely due to a model mismatch.

An alternative approach for constructing a DM consists in approximating the first-order
time derivative with linear combinations of the observable $x$ at neighbouring points along
the time lattice. It is well known that one can write

$$\dot{x}(t) = \frac{1}{\Delta} \sum_{i=0}^{m_1} a_i\, x\bigl(t + (1-i)\Delta\bigr) + O(\Delta^{m_1}) \tag{20}$$

for a suitable choice of the coefficients $a_i$. Upon substituting the above expression in the
initial DDE and solving for $x(t+\Delta)$, we find that $x(t+\Delta)$ can be expressed as a function
of the $m_1$ preceding values and 1 value one delay unit back in time. In other words, we have
arrived at a DM of type $(m_1, 1)$, which involves an unavoidable error $\sigma_{\mathcal{A}}(m_1, 1, \tau_0) \sim \Delta^{m_1+1}$.
This is again a purely perturbative result, which is valid only for moderately large $m_1$.
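
The scaling of the one-step error with the sampling time can be illustrated on a DDE whose solution is known in closed form. The sketch below uses the test equation x'(t) = -x(t - pi/2), solved by x(t) = sin(t); this equation, the difference formulas chosen (forward difference for a (1,1) model, centred difference for a (2,1) model), and the function names are assumptions made for illustration only.

```python
import math

TAU0 = math.pi / 2  # with this delay, x(t) = sin(t) solves x'(t) = -x(t - TAU0)


def H(x_now: float, x_delayed: float) -> float:
    """Right-hand side of the test DDE (it happens not to depend on x_now)."""
    return -x_delayed


def step_1_1(t: float, dt: float) -> float:
    """(1,1) model from the forward difference: x(t+dt) = x(t) + dt * H."""
    return math.sin(t) + dt * H(math.sin(t), math.sin(t - TAU0))


def step_2_1(t: float, dt: float) -> float:
    """(2,1) model from the centred difference: x(t+dt) = x(t-dt) + 2 dt * H."""
    return math.sin(t - dt) + 2.0 * dt * H(math.sin(t), math.sin(t - TAU0))


def one_step_error(step, t: float, dt: float) -> float:
    """Distance between the model prediction and the exact solution at t + dt."""
    return abs(math.sin(t + dt) - step(t, dt))


t0 = 0.7
# Ratio of the one-step errors when the sampling time is doubled.
ratio_1_1 = one_step_error(step_1_1, t0, 0.02) / one_step_error(step_1_1, t0, 0.01)
ratio_2_1 = one_step_error(step_2_1, t0, 0.02) / one_step_error(step_2_1, t0, 0.01)
```

Doubling the sampling time should multiply the error by roughly 2^2 for the (1,1) model and by roughly 2^3 for the (2,1) model, consistent with an error growing as the sampling time to the power m_1 + 1.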

In both the above discussed cases, we have seen that a discrete-time model can reproduce
the dynamics of the original continuous-time system only approximately. In comparison to
low-dimensional dynamical systems, for which we know that a generic ODE can be exactly
transformed into a discrete mapping (even with the additional advantage of reducing the
phase-space dimension, if a Poincaré section is taken), the above results look very modest.
The main reason for such a difference is that when a DDE is turned into a DM,
the phase space is necessarily "compressed" from an infinite- to a finite-dimensional one.
The compression may be practically harmless, but necessarily involves the loss of small but
nonzero interaction terms.

The two pairs $(1, m_2)$ and $(m_1, 1)$ are the limit cases of the more general combination
$(m_1, m_2)$. We have been unable to estimate directly the uncertainty in this general case,
because we failed to find an interpretation of the corresponding model in terms of derivatives
and/or integrals. Nevertheless, with the help of a recursive argument we conjecture that

$$\sigma(m_1, m_2, \tau_0) \sim \Delta^{m_1+m_2}. \qquad (21)$$

We show this by starting from a DM model of the type $(m_1, m_2)$, namely

$$x(n+1) = F^{(1)}(\vec v(n; m_1, m_2, \tau_0)), \qquad (22)$$

which we assume to be accurate up to order $\Delta^{m_1+m_2}$. Moreover, we can claim that, as long
as the distance between the two windows of a given pair is $\Delta$-close to the true delay $\tau_0$, the
error of the corresponding model does not change significantly. Accordingly, the dynamics
is equally well described by the model

$$x(n+1) = F^{(2)}(\vec v(n; m_1, m_2, \tau_0 - 1)). \qquad (23)$$

By now solving this latter equation with respect to $x(n + 1 - m_1)$ (assuming that no problems
connected with the invertibility of the nonlinear expression arise) and substituting in
Eq. (22), we can write

$$x(n+1) = F^{(3)}(\vec v(n; m_1 - 1, m_2 + 1, \tau_0)) \qquad (24)$$

for some function $F^{(3)}$. In other words, the value of $x$ at time $n + 1$ can be predicted on
the basis of a window pair of the type $(m_1 - 1, m_2 + 1)$ with the same order of magnitude
for the uncertainty, i.e. $\Delta^{m_1+m_2}$ (in fact, the additional factor due to the error propagation
in the inversion of the second equation is a finite correction term, independent of $\Delta$). By
iterating the same argument, one can eventually convince oneself that the accuracy of the
model depends only on the total number of points in the two windows.

Finally, we briefly discuss the general case of how to approximate multicomponent DDEs
with multicomponent DM models. In order to avoid technicalities, we limit ourselves to
summarizing the main steps, the whole derivation being straightforward. The generalization of
the first approach (leading to $(1, m)$-models in the scalar case) confirms the naive expectation
based on the knowledge of the scalar case, i.e., window pairs of the type $(m_d, m_d + l)$ guarantee
an error of the order $\Delta^{l+2}$. In fact, in full analogy with the discussion of Eq. (19), one can
conclude that the formal integration of all equations allows approximating the original DDE
up to order $\Delta^{l+2}$ with a "generalized" DM, where $l + 1$ past values of the scalar
feedback variable are required. A simple repetition of the arguments presented in the first
part of this section shows that this vector map can be turned into a scalar one of the type
$(m_d, m_d + l)$.

In the complementary case that has led to the development of $(m, 1)$-models in the scalar
case, the scenario is much worse, since the derivative of each of the $d$ variables must be
determined with the prescribed level of accuracy. In order to fulfil this requirement, one
must transform the original DDE into a DM involving $d\,m_d$ variables in the first window. A
far larger number of variables is therefore required as soon as $d > 1$.

IV. LOCAL INDICATORS 

The most general delayed systems involve the dynamics of several components. In prac- 
tice, however, only a single scalar variable is available. Hence, it is natural to reconstruct the 
dynamics in terms of intrinsically discrete models such as DMs. This choice of model class 
is further motivated by the numerical instabilities that are known to affect the computation 
of derivatives (required in the practical implementation of DDE models). 

There are several ways to quantify the deviations from the expected dynamics in delayed 
systems, such as the filling factor p4| , p^5| , the ACE-method |Tj|, and others fl?fl . For the 
sake of simplicity, here we restrict our investigations to (m, m)-DMs, 

$$y(n+1) = h(\vec v(n; m, \tau)). \qquad (25)$$

In the following, we shall use the one-step forecast error $\sigma(h; m, \tau) = \sigma(h; m, m, \tau)$, and its
minimum in the set $\mathcal{M}$: $\sigma_{\mathcal{M}}(m, \tau) = \sigma(\hat h; m, \tau)$, where the function
$\hat h \in \mathcal{M}$ minimizes $\sigma(h; m, \tau)$, as a tool both to identify the correct delay time and to
construct a meaningful model. In practice, one cannot deal with a space as large as that of the
analytic functions $\mathcal{A}$ considered in the previous section. Accordingly, one must first identify a
proper class of parametrized functions $h$ to work with:

$$\mathcal{M}(m) = \{ h(\cdot\,; \mathbf{a}) : \mathbb{R}^{2m} \to \mathbb{R} \}, \qquad (26)$$

where the parameter $\mathbf{a}$ is varied to minimize the FCE. The optimal choice of a specific
class $\mathcal{M}(m)$ depends on the problem under consideration. In practice, however, local linear
models or global models built by radial basis functions [26] are generally quite successful.
Here, we stick to the former class. The average required by the definition (15) is
performed along the available time series.

In practice, besides the fundamental limitations discussed in the previous section, several
additional factors affect the reconstruction: the finite sampling time $\Delta$, measurement noise,
the finite number $L$ of data, and the mismatch between the delay time $\tau_0$ and the actual
sampling time, i.e. $\mathrm{mod}(\tau_0, \Delta) \neq 0$ (note that so far in the literature only the case of
no mismatch has been discussed). While the
effect of noise will be considered in the second part of the paper with reference to a truly
experimental system, here we shall investigate whether the other limitations may actually
obstruct the model reconstruction.

The approach adopted in this section consists in identifying the optimal model $h$ in
$\mathcal{M}(m)$ as the one minimizing $\sigma$ (as in the previous section) and then finding the minimum
value of $m$ and the appropriate value of the delay $\tau$ such that the "distance" $\sigma_{\mathcal{M}}(m, \tau)$ of
the reconstructed model (25) from the true dynamics is sufficiently small.



However, it is important to notice that local closeness between the model and the true
dynamics does not necessarily imply closeness of the global dynamics. We can see this by
discussing the case of a grossly wrong $\tau$-value, wherein we can expect that $y(n + 1)$ is almost
totally uncorrelated with the $y$-values belonging to the second (delayed) window. Accordingly,
the information content of the second window is totally irrelevant in the minimization
procedure of the FCE. By invoking, as in the previous section, the analyticity properties of the
underlying signal $y(t)$, we can estimate that $\sigma_{\mathcal{M}} \sim \Delta^m$ (since knowing the value of $y$ in $m$
points is tantamount to knowing the first $m$ derivatives). This means that the FCE can be
made very small even in the absence of relevant information about the force field (we recall
that the delay is assumed to be far from the correct value), a conclusion that appears utterly
illogical. In fact, this result tells us that in the small sampling-time limit, it is possible to
perform reasonable short-term predictions by simply exploiting the smoothness of the signal
itself. In particular, it is not even possible to distinguish the true dynamics from that of the
naive model $y^{(m)} = 0$, which corresponds to a polynomial dependence on time, i.e. a dynamics
which is neither stationary nor even bounded. The conclusion to be drawn from this
observation is that the smallness of $\sigma_{\mathcal{M}}$ alone is not enough to conclude that a meaningful model
can be extracted from the raw data. This is the reason why we devote the next section to
the discussion of other, global, indicators which do not suffer from the same problems.

For $m_d = 1$ (scalar DDEs), this proved to be a very effective and numerically inexpensive
strategy to detect the unknown delay time $\tau_0$ from time series, since $\sigma_{\mathcal{M}}(m, \tau)$ displays a
pronounced local minimum for $\tau = \tau_0$ [20]. Before presenting the numerical data, let us
complete the general discussion about the FCE by comparing the previous considerations with
the expectation for $\tau = \tau_0$. From the discussion carried out in the previous section, if $m \geq m_d$,
the FCE is at least of the order of $\Delta^{2(m - m_d + 1)}$. By comparing this estimate, derived for
the correct value of $\tau$, with the typical error $\Delta^m$ expected for a generic delay, we find that the
latter is smaller if $m < 2(m_d - 1)$, which is clearly absurd. Since the FCE is defined
as the optimal error, whenever some prior information is given, we can conclude that
whenever both mechanisms apply (i.e. when $\tau = \tau_0$), it is the most efficient one which
determines the actual FCE. In other words, we do not expect any appreciable dependence of
$\sigma_{\mathcal{M}}(m, \tau)$ on $\tau$ if $m < 2(m_d - 1)$ (preventing the detection of the delay time with the help
of the FCE), while a clear minimum should be seen in the opposite case. We can explain
this behaviour of the FCE by noticing that the two estimates of $\sigma_{\mathcal{M}}$ have been derived by
invoking different mechanisms: (1) continuity of the evolution, (2) effective approximation
of the delayed feedback model. Since both mechanisms allow for high-quality short-term
predictions, both will lower the FCE (and supposedly any local indicator). Therefore, local
indicators are not appropriate tools to distinguish between the two mechanisms. Global
indicators (as discussed in the next section) are good candidates to also detect the delay
times for $m < 2(m_d - 1)$. In any case, the opposite inequality, $m \geq 2(m_d - 1)$, represents a
necessary condition to be satisfied by a local indicator (such as the FCE) for a correct
identification of the delay in the worst possible case.
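
The detection of the delay through the minimum of the forecast error can be sketched on a toy
example. Everything in the block below is an assumption made for illustration: the delayed
logistic map replaces the Mackey-Glass system, a global quadratic least-squares fit replaces
the local linear models used in the paper, the error is evaluated in-sample, and the delay is
counted in lattice steps.

```python
import numpy as np

def delayed_logistic(n_steps, delay, eps=0.9, seed=0):
    """Toy delayed map y(n+1) = (1-eps)*y(n) + eps*4*y(n-delay)*(1-y(n-delay))."""
    rng = np.random.default_rng(seed)
    y = np.empty(n_steps + delay + 1)
    y[:delay + 1] = rng.uniform(0.1, 0.9, delay + 1)  # random initial history
    for n in range(delay, n_steps + delay):
        yd = y[n - delay]
        y[n + 1] = (1 - eps) * y[n] + eps * 4.0 * yd * (1.0 - yd)
    return y

def forecast_error(y, k, k_max):
    """Normalized one-step error of a quadratic (1,1)-model with trial delay k."""
    n = np.arange(k_max, len(y) - 1)
    a, b = y[n], y[n - k]
    design = np.column_stack([np.ones_like(a), a, b, a * a, a * b, b * b])
    coef, *_ = np.linalg.lstsq(design, y[n + 1], rcond=None)
    resid = y[n + 1] - design @ coef
    return float(np.sqrt(np.mean(resid ** 2)) / np.std(y))

def scan_delay(y, k_max):
    """Forecast error for every trial delay k = 1, ..., k_max."""
    ks = np.arange(1, k_max + 1)
    errs = np.array([forecast_error(y, k, k_max) for k in ks])
    return ks, errs

if __name__ == "__main__":
    y = delayed_logistic(6000, delay=10)[1000:]  # discard a transient
    ks, errs = scan_delay(y, k_max=20)
    print("estimated delay:", ks[np.argmin(errs)])
```

In this toy case the model class contains the true map, so the error at the correct delay
drops to round-off level; for a sampled DDE the minimum is finite, as discussed above.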

In the following we discuss the problem of model reconstruction with reference to the
generalized Mackey-Glass system [24]



$$\dot x(t) = -\omega^2 y(t) - \rho\, x(t), \qquad (27)$$

which can be easily transformed into a second-order ($d = 2$), neutral DDE

$$\ddot y(t) = -\omega^2 y(t) - \rho\, \dot y(t) + \omega^2 f(y(t - \tau_0)) + \frac{d f(y(t - \tau_0))}{d y(t - \tau_0)}\, \dot y(t - \tau_0). \qquad (28)$$

The parameters are chosen as $a = 3.0$, $\rho = 1.5$, $\omega = 1.0$, and $\tau_0 = 9.83$, for which the
Kaplan-Yorke dimension of the attractor is $D_{KY} \approx 7.2$. For $\rho = \omega^2$ and $\rho \to \infty$, the above
equation reduces to the standard Mackey-Glass system by eliminating adiabatically the variable $x(t)$.

For the analysis, we use a time series of the variable that is fed back, $y(t)$, with a sampling
time $\Delta = 0.1$. Notice that with this choice of $\Delta$, the retarded values $y(t - \tau_0)$ lie outside
the time lattice if $y(t)$ corresponds to one of the sampled values. In Fig. 3, portions of the
time series and a delay plot of an extremal section $\dot y(t_i) = 0$ are presented. The effect of the
second component $x$ in the dynamical equation (27) can be clearly visualized in the delay
plot (with the delay being close to the delay time $\tau_0$), since the intersection points of a scalar
system have to lie on a curve in such a representation [14].



FIG. 3. (a-b) Time series of the generalized Mackey-Glass system; (c) delay plot of an extremal
section. The values of the extremal points $y(t_i)$, $\dot y(t_i) = 0$, are plotted versus their retarded values.

We have numerically investigated the FCE $\sigma_{\mathcal{M}}(m, \tau)$ with a local linear model for different
choices of $m$ and $\tau$. The FCE is computed taking into account time series of length
$L(m = 1) = 50{,}000$; $L(m = 2) = 100{,}000$; $L(m = 3) = 200{,}000$. From the data reported
in Fig. 4, it is interesting to notice that a pronounced minimum of $\sigma_{\mathcal{M}}$ is observed even for
$m = 1$, when, a priori, there is no reason to expect a faithful reproduction of the original
dynamics. Such a result is the consequence of a general feature of dissipative systems: the
various components are not equally "active". Indeed, as long as the attractor is
high-dimensional, the feedback term can be viewed as a noise term. In the absence of this "noise"
source [27], the original system reduces to an ordinary differential equation, whose attractor
fills a manifold of dimension smaller than $m_d$. The addition of the noise "thickens" the
distribution along all directions in the state space, the width of the distribution depending
on both transverse stability and noise amplitude. Accordingly, it may happen that the role
of some components (corresponding to rather stable directions) in a multidimensional DDE
is just to blur the distribution generated by a suitable DDE with fewer components. This is,
to some extent, what happens in our system, as clearly seen in Fig. 3(c), where the points
cluster around a smooth curve which is the expected shape for the scalar Mackey-Glass
system. An $(m = 1)$-model will detect this curve, leading to a local minimum in the FCE.




FIG. 4. One-step forecast error of an $(m, m)$-model as a function of $\tau$ (horizontal axis,
from 6.0 to 14.0). From top to bottom, the curves refer to $m = 1$, 2, and 3, respectively.

The position of the minimum is an estimate $\hat\tau(m)$ of the delay time. A parabolic
approximation of the FCE around the minimum yields the data reported in the following
table.

                    m = 1          m = 2          m = 3
  $\hat\tau(m)$     9.81 ± 0.1     9.87 ± 0.1     9.72 ± 0.1

The estimated values agree with the correct value $\tau_0 = 9.83$ within the errors due to the
finiteness of the sampling time.
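
A parabolic refinement of this kind can be sketched as follows. The three-point vertex formula
is standard; the function name and the sample values are hypothetical and serve only to show
how a minimum lying between lattice points is recovered.

```python
def parabolic_minimum(x, f):
    """Abscissa of the vertex of the parabola through three points (x[i], f[i])."""
    (x0, x1, x2), (f0, f1, f2) = x, f
    num = (x1 - x0) ** 2 * (f1 - f2) - (x1 - x2) ** 2 * (f1 - f0)
    den = (x1 - x0) * (f1 - f2) - (x1 - x2) * (f1 - f0)
    return x1 - 0.5 * num / den

if __name__ == "__main__":
    # Hypothetical forecast errors on three neighbouring trial delays,
    # generated from a parabola with its minimum at 9.83.
    taus = (9.7, 9.8, 9.9)
    errors = tuple((t - 9.83) ** 2 + 0.01 for t in taus)
    print(parabolic_minimum(taus, errors))
```

The three points should bracket the minimum; if they are collinear the denominator vanishes
and no vertex exists.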

Some comments are in order about the behaviour of the FCE. First of all, let us notice 
that a local minimum is observed also for zero delay. This minimum is due to the fact that 
we pass from a system of two windows of length m to a single window of double length.
According to the arguments put forward in the first part of this section, we have to expect 
an accuracy of the same order as for the leading minimum. However, this accuracy cannot 
correspond to an equivalent accuracy of the global dynamics as the information about the 
feedback is missing. Moreover, we notice that the plateaus of the various curves decrease 
for increasing m. This is qualitatively in agreement with the considerations presented in 
the first part of this section about grossly wrong delays. However, there is no quantitative 
agreement about the scaling dependence on m: we attribute this to the existence of residual 
correlations between y values even when they are some time units apart. 

Analogous considerations can be made for the height of the minimum, which decreases less
than expected on the basis of the general considerations discussed in the previous section.
In this case, we have identified the accuracy of the local linear model and the finiteness
of the number of points as the main limiting factors which prevent $\sigma_{\mathcal{M}}$ from being smaller
for $m = 3$.

V. GLOBAL INDICATORS 

We have seen that the FCE is a useful tool to decide whether a reconstructed model $h$
is locally close to the observed dynamics. Nevertheless, there is no guarantee either that the
model dynamics remains confined to the region where it has been originally defined (e.g., that
it does not explode) or that it does not converge to a smaller subset (e.g., a fixed point or
a limit cycle). In other words, the smallness of the FCE $\sigma_{\mathcal{M}}$ is a necessary but not sufficient
condition to establish whether a given model provides a globally faithful reconstruction. To
test this, we iterate the models $h$ for different $m$-values and the optimal choice of the delay
time, $\tau = \hat\tau(m)$, to generate some typical time series $\{y_m\}$. We consider a model as valid
if the resulting attractor is "close" to the original one. To this aim, we introduce and utilize
the cross forecast error, and compute the power spectrum, the probability distribution of the
sampled variable, and the Lyapunov spectrum as tools to establish altogether the validity
of a given model.

However, before discussing all such indicators, it is instructive to perform a qualitative
analysis of the generalized Mackey-Glass system (27) for $m = 1, 2, 3$ and the optimal choice
of the delay time (as identified in the previous section). The resulting time series $\{y_m\}$ (of
length $L = 100{,}000$) reveal a qualitatively good agreement with the original series only if
$m \geq 2$. Indeed, for $m = 1$, the time series $y_1$ is asymptotically attracted to either a strictly
positive or a strictly negative region (see Fig. 5).

FIG. 5. Iterated time series $\{y_m\}$ of the $(m = 1)$-model: (a) Convergence to an attractor with
purely positive values; (b) blow-up of the attractor dynamics; (c) delay plot of an extremal section.

Indeed, attractors with a specific sign exist in the standard Mackey-Glass system, where
the unstable fixed point $y = 0$ acts as an "impenetrable" domain boundary separating the
two coexisting attractors (changes of sign can exist only if they are present in the initial
condition; during the evolution, once they have disappeared, they cannot be generated again).
Since, for $m = 1$, the second variable is obviously absent, it is not surprising that the
reconstructed dynamics exhibits typical features of the standard Mackey-Glass system.

As this eventual convergence towards either positive or negative values persists, independently
of the accuracy used in the model reconstruction and of the number of data points,
we must rule out the possibility that $m_d = 1$, i.e. that a minimal approximate model can
be constructed with just one component.

The problem of quantifying the "closeness" between the original time series $y$ and the
iterated time series $y_m$ cannot be addressed by measuring to what extent the model-generated
time series remains close to the original time series all the way. In fact, because of the
chaotic properties of the evolution, an exponential separation always occurs which can hide
the statistical equivalence of the two time series. The most appropriate approach would
consist in defining and measuring the distance between the two probability distributions.
The natural space where this question should be formulated is the $(2m + 1)$-dimensional space
$S$ introduced in Sec. III, i.e. the same space where the FCE is estimated and the dynamical
rule reconstructed. There are various ways to define a distance, such as the Kullback-Leibler
information [28] or the cross correlation sum [29]. Unfortunately, a meaningful
implementation is a rather delicate matter. For instance, in the case of the Kullback-Leibler
information, one needs sufficiently many data to get rid of statistical fluctuations in the
local probabilities. Therefore, we have preferred to introduce a more robust geometrical
indicator which, although carrying less information, can be satisfactorily complemented
with the implementation of other tools.

For any point $P \in \{y\}$, determined by following the original trajectory, we identify the
closest template used to construct the local linear model along the iterated time series $\{y_m\}$
(after a suitable transient) and measure the distance $d(n)$ of $P$ from such a $2m$-dimensional
surface. By averaging the square distances over all points $P$, we finally obtain the global
indicator

$$\chi(m) = \sqrt{\frac{1}{N - n_0} \sum_{n=n_0+1}^{N} \frac{d^2(n)}{\langle y^2 \rangle - \langle y \rangle^2}}, \qquad (29)$$

where $n_0$ is such that the components of $\vec v(n; m, \tau)$ are all in $\{y(n)\}$. The quantity $\chi(m)$
is essentially the average forecast error along the time series $\{y\}$ on the basis of a model
of the time series $\{y_m\}$, the latter being restricted to the attractor region (cross forecast
error [30]).
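
The idea of a cross forecast error can be sketched in a simplified form. The block below is not
the indicator of Eq. (29): instead of measuring the distance from local linear templates, it
uses a zeroth-order nearest-neighbour predictor built from the model series and normalizes the
root-mean-square error by the standard deviation of the original data. The function names and
the logistic-map test series are assumptions made for illustration.

```python
import numpy as np

def embed(y, m, lag):
    """Two-window states (y(n..n-m+1), y(n-lag..n-lag-m+1)) and targets y(n+1)."""
    y = np.asarray(y, dtype=float)
    n = np.arange(lag + m - 1, len(y) - 1)
    cols = [y[n - i] for i in range(m)] + [y[n - lag - i] for i in range(m)]
    return np.column_stack(cols), y[n + 1]

def cross_forecast_error(y_orig, y_model, m, lag):
    """Predict the original series with neighbours taken from the model series."""
    v_o, target_o = embed(y_orig, m, lag)
    v_m, target_m = embed(y_model, m, lag)
    # Squared distances between every original state and every model state.
    d2 = ((v_o[:, None, :] - v_m[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    err = target_o - target_m[nearest]
    return float(np.sqrt(np.mean(err ** 2) / np.var(y_orig)))

def logistic_series(r, x0, n):
    """Hypothetical stand-in data: iterates of the logistic map."""
    x = np.empty(n)
    x[0] = x0
    for i in range(n - 1):
        x[i + 1] = r * x[i] * (1.0 - x[i])
    return x

if __name__ == "__main__":
    original = logistic_series(3.9, 0.3, 1500)
    good_model = logistic_series(3.9, 0.4, 1500)   # same rule, other initial condition
    poor_model = logistic_series(3.6, 0.4, 1500)   # different rule
    print(cross_forecast_error(original, good_model, m=1, lag=1))
    print(cross_forecast_error(original, poor_model, m=1, lag=1))
```

A model series generated by the correct rule yields a much smaller value than one generated by
a different rule. The all-pairs distance matrix limits this sketch to a few thousand points.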

The results are presented in the following table:

  m     $\chi(m)$
  1     1.197
  2     0.032
  3     0.026


As a result of the confinement of the dynamics to an attractor with purely positive values,
the distance $\chi(m)$ is large for $m = 1$ (actually, so large that it compares with the standard
deviation of the data). For $m \geq 2$, $\chi(m)$ decreases substantially (and could be further
reduced by increasing the number of data points). Accordingly, the minimal choice $m_d = 1$
does not yield a faithful reconstruction, while the hypothesis $m_d = 2$ is already sufficiently
good to be almost indistinguishable from further refinements (with the reasonable amount
of data points adopted in our simulations).

Nevertheless, as already anticipated, such an indicator does not necessarily give a definite
answer. In fact, we can imagine two distributions with the same support but grossly different
densities. A geometrical indicator such as $\chi(m)$ would likely fail to identify at once such
important differences, since small distances would be found for all points (only a finer analysis
could possibly allow detecting an insufficient quality of the reconstruction).




FIG. 6. Power spectra of the original time series of the generalized Mackey-Glass system, the
$(m = 2)$-model, and the $(m = 3)$-model, as a function of the frequency $f$ (from 0.0 to 0.5).
The inset shows a blow-up for low frequencies.

Therefore, we have decided to compute other quantities which also have a direct physical
meaning. In Fig. 6 and Fig. 7, we compare the spectra and the histograms of the original
time series and of the iterated time series $y_2$, $y_3$. A good agreement is achieved in both cases.
Since no significant improvements are found in going from $m = 2$ to $m = 3$, we can confirm
the previous conjecture that $m_d = 2$ is the minimal number of components necessary for a
good reconstruction.
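
A comparison of this kind can be sketched with a plain periodogram and a histogram distance.
The function names, the use of the total variation distance and the number of bins are
assumptions made for illustration; the paper does not specify how the spectra and histograms
were computed.

```python
import numpy as np

def power_spectrum(y):
    """Periodogram of a series; frequencies are in cycles per sample (0 to 0.5)."""
    y = np.asarray(y, dtype=float)
    y = y - np.mean(y)
    spec = np.abs(np.fft.rfft(y)) ** 2 / len(y)
    freq = np.fft.rfftfreq(len(y), d=1.0)
    return freq, spec

def histogram_distance(y_a, y_b, bins=50):
    """Total variation distance between the two histograms, a number in [0, 1]."""
    lo = min(np.min(y_a), np.min(y_b))
    hi = max(np.max(y_a), np.max(y_b))
    p, _ = np.histogram(y_a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(y_b, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    return 0.5 * float(np.abs(p - q).sum())

if __name__ == "__main__":
    n = np.arange(1000)
    freq, spec = power_spectrum(np.sin(2 * np.pi * 0.1 * n))
    print("dominant frequency:", freq[np.argmax(spec)])
```

In practice one would average the periodogram over several segments before comparing the
original and the iterated series, since a raw periodogram fluctuates strongly.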

FIG. 7. Histograms of the original time series of the generalized Mackey-Glass system, the
$(m = 2)$-model, and the $(m = 3)$-model.

As a consequence of a successful modelling, it is not only possible to forecast the evolution
in the real space, but also to extract information about the tangent space. In particular, one
can compute the Lyapunov spectrum (LS) [31] for different choices of $m$ (in correspondence
with the optimal value of the delay). We expect that the LS grossly differs from the correct
one whenever $m$ is chosen too small, so that we can use Lyapunov exponents as a further
global indicator to judge the quality of the reconstructed model.

In our example of the generalized Mackey-Glass equation, we estimated the LS for $m =
1, 2,$ and 3. The results are compared with the estimation of the spectrum obtained by direct
integration of the equations (see Fig. 8). Again we observe large deviations for $m = 1$,
while for $m \geq 2$, the LS is rather close to the true spectrum, thus confirming once more the
scenario suggested by the other indicators.
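
For a delayed map, the Lyapunov spectrum follows from the Jacobians of the map written in
companion (delay-vector) form. The sketch below uses the standard QR re-orthonormalization
procedure; as a stand-in example it takes the Hénon map written as a delayed map,
$x(n+1) = 1 - a\,x(n)^2 + b\,x(n-1)$, which is an assumption made for illustration and not one of
the reconstructed models of the paper.

```python
import numpy as np

def lyapunov_spectrum(jacobians):
    """Lyapunov exponents from an iterable of Jacobian matrices (QR method)."""
    q = None
    log_r = None
    count = 0
    for jac in jacobians:
        if q is None:
            q = np.eye(jac.shape[0])
            log_r = np.zeros(jac.shape[0])
        # Propagate the orthonormal frame and re-orthonormalize it.
        q, r = np.linalg.qr(jac @ q)
        log_r += np.log(np.abs(np.diag(r)))
        count += 1
    return log_r / count

def henon_delay_jacobians(n_steps, a=1.4, b=0.3, n_transient=100):
    """Jacobians of the state (x(n), x(n-1)) for x(n+1) = 1 - a*x(n)**2 + b*x(n-1)."""
    x_prev, x = 0.0, 0.0
    for n in range(n_transient + n_steps):
        if n >= n_transient:
            yield np.array([[-2.0 * a * x, b],
                            [1.0, 0.0]])
        x_prev, x = x, 1.0 - a * x * x + b * x_prev

if __name__ == "__main__":
    print(lyapunov_spectrum(henon_delay_jacobians(5000)))
```

A useful consistency check is that the exponents sum to the average of $\ln|\det J|$, which for
this example equals $\ln b$ at every step.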

FIG. 8. Lyapunov spectra of the generalized Mackey-Glass system as estimated from the
equations (solid line); from the $(m = 1)$-model (dashed line); from the $(m = 2)$-model (dotted line);
from the $(m = 3)$-model (dot-dashed line). The inset shows a blow-up of the largest Lyapunov
exponents.

While it is not the goal of this paper to derive general quantitative estimates of the various
sources of errors, we would finally like to draw the attention of the reader to the effect of
dynamical noise that, to a different extent, affects any experiment. Dynamical noise, even
more strongly than measurement noise, can mask the presence of hidden degrees of freedom.
Once more, the Mackey-Glass system is a simple system to illustrate this phenomenon,
since additional noise in the scalar model can induce jumps from positive to negative values
of the variable $y$, thus making the evolution essentially indistinguishable from that of a
noisy generalized system. As a consequence, the problem of a correct identification of the
deterministic components depends on the possible/required accuracy of the modelling.
In some cases, it might even be desirable to model only the gross nonlinear features in
terms of a few variables while assimilating all the others to a sort of background noise
indistinguishable from the true noise.

We conclude with a remark about the length of the time series. It has been emphasized
by some authors that the number of points necessary to estimate the delay time in a scalar
system can be quite small (say 500-1000) compared to the number of data points
required in conventional nonlinear time series analysis. In principle, we can confirm this
result for multi-component systems and a single time-delay feedback. On the one hand, the
discovery of hidden variables requires embedding the data in spaces of increasingly higher
dimensions (though much smaller than the dimension of the entire phase space) and thus
an increasingly larger number of data points. On the other hand, since we do not aim at
detecting scale-invariant properties, the number of points required to obtain statistically
significant results for the estimation of the delay time can be comparatively small.

VI. OPEN PROBLEMS



In this paper we have shown that time-delayed feedback systems can be investigated on
the basis of a single-valued time series. In particular, we have seen that the dynamics can
be reconstructed in low-dimensional state spaces even when the attractor dimension can be
arbitrarily large. This result has been obtained by restricting ourselves to the case where
only one variable is fed back with a fixed specific delay time $\tau_0$. Two possible generalizations
of this setup can be conceived that might be of interest in practical applications.

First, one can assume that the single feedback variable acts with several different delays.
As long as the number of such interactions is finite, no dramatic changes are expected from
a theoretical point of view: instead of working with a "two-window" embedding, it should
suffice to use an $(n + 1)$-window embedding, where $n$ is the number of delays. This
is a straightforward generalization, as long as the windows do not overlap. Of course, the
advantage of a low dimensionality of the state space is lost as soon as $n$ becomes large,
but for only a few delays it might still work reasonably well. The situation is completely
different when we have to deal with a continuous spectrum of delays. In this case we expect
this method to fail, as it is no longer possible to reconstruct the equations of motion in a
low-dimensional manifold.

A second possible generalization consists in sticking to a single delay time $\tau_0$, but admitting
that several variables are fed back. This is similar to the case where we measure the
wrong variable, as only a scalar variable is used to reconstruct the dynamics. This suggests
that the length of the two windows should be increased by some factor in this case.

Another open problem concerns the uncertainty affecting a DM model that arises from 
the model mismatch due to the supposedly continuous-time dynamics. In this paper, we 
have employed perturbative arguments to estimate the order of magnitude of the FCE, when 
the windows are not too long. However, this is still insufficient to draw mathematically rig- 
orous conclusions about the convergence properties of DM models towards the expected 
continuous-time limit. In fact, a non-perturbative approach is presumably necessary to deal 
with large window-lengths, besides the inclusion of additional information about the dynam- 
ical behaviour of the process under investigation. This is a hard task that extends a general 
and still unsolved problem: that of estimating the indeterminacy of an optimal prediction 
(on the basis of the standard embedding approach) for a high-dimensional deterministic 
process. 